SPECIFICATION

Process for Obtaining DNA, RNA, Peptides,
Polypeptides, or Proteins by Recombinant DNA
Technique

The present invention has as its object a process
for obtaining DNA, RNA, peptides, polypeptides or
proteins, through use of transformed host cells
containing genes capable of expressing these RNAs,
peptides, polypeptides, or proteins; that is to say, by
utilisation of recombinant DNA technique.

The invention aims in particular at the production
of stochastic genes or fragments of stochastic genes
in a fashion to permit obtaining simultaneously,
after transcription and translation of these genes, a
very large number (on the order of at least 10,000) of
completely new proteins, in the presence of host
cells (bacterial or eucaryotic) containing these genes
respectively capable of expressing these proteins,
and to carry out thereafter a selection or screening
among the said clones, in order to determine which
of them produce proteins with desired properties,
for example structural, enzymatic, catalytic,
antigenic, pharmacologic, or properties of
liganding, and more generally, chemical,
biochemical, biological, etc. properties.

The invention also has as its aim procedures for
obtaining sequences of DNA or RNA with utilizable
properties, notably chemical, biochemical, or
biological properties.

It is clear, therefore, that the invention is open to a
very large number of applications in very many
areas of science, industry and medicine.
The process for production of peptides or
polypeptides according to the invention is
characterized in that one produces simultaneously,
in the same medium, genes which are at least
partially composed of synthetic stochastic
polynucleotides, that one introduces the genes thus
obtained into host cells, that one cultivates
simultaneously the independent clones of the
transformed host cells containing these genes in
such a manner so as to clone the stochastic genes
and to obtain the production of the proteins
expressed by each of these stochastic genes, that
one carries out selection and/or screening of the
clones of transformed host cells in a manner to
identify those clones producing peptides or
polypeptides [...] peptide or polypeptide having [...]

In a first mode of carrying out this process,
stochastic genes are produced by stochastic
polymerization of nucleotides on the two ends of a
previously linearized expression vector, followed by
formation of cohesive ends in such a fashion as to
form a stochastic first strand of DNA, constituted by
a molecule of expression vector possessing two
stochastic sequences whose 3' ends are
complementary, followed by synthesis of the second
strand of the stochastic DNA.

In a second mode of carrying out this process,
stochastic genes are produced by polymerization
of oligonucleotides without cohesive ends, in a
manner to form fragments of stochastic DNA,
followed by ligation of these fragments to a
previously linearized expression vector.
The expression vector can be a plasmid, notably a
bacterial plasmid. Excellent results have been
obtained using the plasmid pUC8 as the expression
vector.

The expression vector can also be viral DNA or a
hybrid of plasmid and viral DNA.

The host cells can be procaryotic cells such as HB
101 and C 600, or eucaryotic cells.

When utilizing the procedure according to the
second mode mentioned above, it is possible to
utilize oligonucleotides which form a group of
palindromic octamers.

Particularly good results are obtained by utilizing
the following group of palindromic octamers:

5' GGAATTCC 3'
5' GGTCGACC 3'
5' CAAGCTTG 3'
5' CCATATGG 3'
5' CATCGATG 3'

It is also possible to use oligonucleotides which
form a group of palindromic heptamers.

Very good results are obtained utilizing the
following group of palindromic heptamers:

5' XTCGCGA 3'
5' XCTGCAG 3'
5' RGGTACC 3'

where X = A, G, C or T, and R = A or T.

According to a method to utilize these procedures
which is particularly advantageous, one isolates and
purifies the transforming DNA of the plasmids from
a culture of independent clones of the transformed
host cells obtained by following the procedures
above, then the purified DNA is cut by at least one
restriction enzyme corresponding to a specific
enzymatic cutting site present in the palindromic
octamers or heptamers but absent from the
expression vector which was utilized; this cutting is
followed by inactivation of the restriction enzyme,
then one simultaneously treats the ensemble of
linearized stochastic DNA fragments thus obtained
with T4 DNA ligase, in such a manner as to create a
new ensemble of DNA containing new stochastic
sequences; this new ensemble can therefore contain
a number of stochastic genes larger than the
number of genes in the initial ensemble. One then
utilizes this new ensemble of transforming DNA to
transform the host cells and clone these genes, one
utilizes screening and/or selection and isolates the
new clones of transformed host cells, and finally
these are cultivated to produce at least one peptide
or polypeptide, for example, a new protein.

The property serving as the criterion for selection
of the clones of host cells can be the capacity of the
peptides or polypeptides, produced by a given
clone, to catalyse a given chemical reaction.

[...] peptides and/or polypeptides, the said property
can be the capacity to catalyse a sequence of
reactions leading from an initial group of chemical
compounds to at least one target compound.
With the aim of producing an ensemble
constituted by a plurality of peptides and
polypeptides which are reflexively autocatalytic, the
said property can be the capacity to catalyse the
synthesis of the same ensemble from amino acids
and/or oligopeptides in an appropriate milieu.
The said property can also be the capacity to
modify selectively the biological or chemical
properties of a given compound, for example the
capacity to selectively modify the catalytic activity of
a polypeptide.

The said property can also be the capacity to
stimulate, inhibit, or modify at least one biological
function of at least one biologically active
compound, chosen, for example, among the
hormones, neurotransmitters, adhesion factors,
growth factors and specific regulators of DNA
replication and/or transcription and/or translation of
RNA.

The said property can equally be the capacity of
the peptide or polypeptide to bind to a given ligand.
The invention also has as its object the use of the
peptide or polypeptide obtained by the process
specified above, for the detection and/or the titration
of a ligand.

According to a particularly advantageous mode of
carrying out the invention, the criterion for selection
of the clones of transformed host cells is the
capacity of these peptides or polypeptides to
simulate or modify the effects of a biologically
active molecule, for example, a protein, and
screening and/or selection for clones of transformed
host cells producing at least one peptide or
polypeptide having this property is carried out by
preparing antibodies against the active molecule,
then utilizing these antibodies, after their
purification, to identify the clones containing this
peptide or polypeptide, then by cultivating the
clones thus identified, separating and purifying the
peptide or polypeptide produced by these clones,
and finally by submitting the peptide or polypeptide
to an in vitro assay to verify that it has the capacity
to simulate or modify the effects of the said
molecule.

According to another mode of carrying out the
process according to the invention, the property
serving as the criterion of selection is that of having
at least one epitope similar to one of the epitopes of
a given antigen.

The invention carries over to obtaining
polypeptides by the process specified above and
utilizable as chemotherapeutically active
substances.

In particular, in the case where the said antigen is
EGF, the invention permits obtaining polypeptides
usable for chemotherapeutic treatment of
epitheliomas.

According to a variant of the procedure, one
identifies and isolates the clones of transformed
host cells producing peptides or polypeptides [...]
by affinity chromatography against antibodies
corresponding to a protein expressed by the natural
part of the DNA hybrid.

For example, in the case where the natural part of
the hybrid DNA contains a gene expressing β-
galactosidase, one can advantageously identify and
isolate the said clones of transformed host cells by
affinity chromatography against anti-β-
galactosidase antibodies.
After expression and purification of hybrid
peptides or polypeptides, one can separate and
isolate their novel parts.

The invention also applies to a use of the process
specified above for the preparation of a vaccine: the
application is characterized by the fact that
antibodies against the pathogenic agent are
isolated, for example antibodies formed after
injection of the pathogenic agent in the body of an
animal capable of forming antibodies against this
agent, and these antibodies are used to identify the
clones producing at least one protein having at least
one epitope similar to one of the epitopes of the
pathogenic agent, the transformed host cells
corresponding to these clones are cultured to
produce these proteins, this protein is isolated and
purified from the clones of cells, then this protein is
used for the production of a vaccine against the
pathogenic agent.
For example, in order to prepare an anti-HVB
vaccine, one can extract and purify at least one
capsid protein of the HVB virus, inject this protein
into an animal capable of forming antibodies
against this protein, recover these antibodies and
use them to identify the clones producing at least
one protein having at least one epitope
similar to one of the epitopes of the HVB virus, then
cultivate the clones of transformed host cells
corresponding to these clones in a manner to
produce this protein, isolate and purify the protein
from culture of these clones of cells and utilize the
protein for the production of an anti-HVB vaccine.
According to an advantageous mode of carrying
out the process according to the invention, the host
cells consist in bacteria such as Escherichia coli
whose genome contains neither the natural gene
expressing β-galactosidase, nor the EBG gene, that
is to say, Z-, EBG- E. coli. The transformed cells are
cultured in the presence of the indicator X-gal and
the inducer IPTG in the medium, and cells positive
for β-galactosidase functions are detected;
thereafter, the transforming DNA is transplanted
into an appropriate clone of host cells for large scale
culture to produce at least one peptide or polypeptide.
The property serving as criterion for selection
of the transformed host cells can also be the
capacity of the polypeptides or proteins produced
by the culture of these clones to bind to a given
compound.

This compound can be in particular chosen
advantageously among peptides, polypeptides and
proteins, notably among proteins regulating the
transcription activity of DNA.

On the other hand, the said compound can also be
chosen among DNA and RNA sequences.

The invention has also as its object those proteins
which are obtained in the case where the property
serving as criterion for selection of the clones of
transformed host cells consists in the capacity of
these proteins to bind to regulatory proteins
controlling transcription activity of the DNA, or else
to DNA and RNA sequences.
The invention has, in addition, as an object the
use of a protein which is obtained in the first
particular case above mentioned, as a cis-regulatory
sequence controlling replication or transcription of a
neighboring DNA sequence.
On the other hand, the aim of the invention also
includes utilization of proteins obtained in the
second case mentioned to modify the properties of
transcription or replication of a sequence of DNA, in
a cell containing the sequence of DNA and
expressing this protein.

The invention has as its object as well a process of
production of DNA, characterized by simultaneous
production in the same medium of genes at least
partially composed of stochastic synthetic
polynucleotides, in that the genes thus obtained are
introduced into host cells to produce an ensemble of
transformed host cells, in that screening and/or
selection on this ensemble is carried out to identify
those host cells containing in their genome
stochastic sequences of DNA having at least one
desired property, and finally, in that the DNA from
the clones of host cells thus identified is isolated.

The invention further has as its object a procedure
to produce RNA, characterized by simultaneous
production in the same medium of genes at least
partially composed of stochastic synthetic
polynucleotides, in that the genes thus obtained are
introduced into host cells to produce an ensemble of
transformed host cells, in that the host cells so
produced are cultivated simultaneously, and
screening and/or selection of this ensemble is
carried out in a manner to identify those host cells
containing stochastic sequences of RNA having at
least one desired property, and in that the RNA is
isolated from the host cells thus identified.

The said property can be the capacity to bind a given
compound, which might be, for example, [...], the
capacity to catalyse a given chemical reaction, or the
capacity to be a transfer RNA.

The invention will now be described in more detail,
as well as its applications, with reference to non
limitative embodiments.

First, we shall describe particularly useful
procedures to carry out the synthesis of stochastic
genes, and the introduction of those genes in
bacteria to produce clones of transformed bacteria.

I) Direct Synthesis on an Expression Vector

a) Linearization of the Vector

30 micrograms, that is, approximately 10^13
molecules of the pUC8 expression vector, are
linearized by incubation for 3 hours at 37°C with 100
units of the PstI restriction enzyme in 300 µl of the
appropriate standard buffer. The linearized vector is
treated [...] in standard [...] three hours, the
linearized vector is electro-eluted, precipitated in
ethanol, and taken up in 30 µl of water.
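
The relation between the mass of vector and the number of molecules can be checked with a short calculation. The sketch below is only an order-of-magnitude estimate: the plasmid length (2665 base pairs for pUC8) and the average mass of a base pair (650 g/mol) are assumptions that are not stated in the text.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MASS_G_PER_MOL = 650.0    # assumed average mass of one DNA base pair
PUC8_LENGTH_BP = 2665        # assumed size of pUC8; not given in the text


def plasmid_molecules(mass_micrograms, length_bp=PUC8_LENGTH_BP):
    """Estimate how many double stranded plasmid molecules a mass contains."""
    molar_mass = length_bp * BP_MASS_G_PER_MOL       # grams per mole
    moles = mass_micrograms * 1e-6 / molar_mass      # micrograms -> grams -> moles
    return moles * AVOGADRO


if __name__ == "__main__":
    # 30 micrograms is the quantity of vector used in step a)
    print(f"{plasmid_molecules(30):.2e} molecules in 30 micrograms")
```

Under these assumptions, 30 micrograms corresponds to roughly 10^13 molecules, which is the order of magnitude needed to carry a very large number of distinct stochastic sequences.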

b) Stochastic Synthesis Using the Enzyme Terminal
Transferase (TdT)

30 µg of the linearized vector are reacted for 2
hours at 37°C with 30 units of TdT in 300 µl of the
appropriate buffer, in the presence of 1 mM dGTP, 1
mM dCTP, 0.3 mM dTTP and 1 mM dATP. The lower
concentration of dTTP is chosen in order to reduce
the frequency of "stop" codons in the
corresponding messenger RNA. A similar result,
although somewhat less favorable, can be obtained
by utilizing a lower concentration for dATP than for
the other deoxynucleotide triphosphates. The
progress of the polymerization on the 3' extremity
of the PstI sites is followed by analysis on a gel of
aliquots taken during the course of the reaction.

When the reaction reaches or passes a mean value
of 300 nucleotides added per 3' extremity, it is
stopped and the free nucleotides are separated from
the polymer by differential precipitation or by
passage over a column containing a molecular sieve
such as Biogel P60. After concentration by
precipitation in ethanol, the polymers are
subjected to a further polymerization with TdT, first
in the presence of dATP, then in the presence of
dTTP. These last two reactions are separated by a
filtration on a gel and are carried out for short
intervals (30 seconds to 3 minutes) in order to add
sequentially 10-30 A followed by 10-30 T to the 3'
ends of the polymers.
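
The effect of lowering the dTTP concentration on the frequency of "stop" codons can be illustrated numerically. The following is a minimal sketch under simplifying assumptions that are not taken from the text: TdT is assumed to incorporate each nucleotide in proportion to its concentration, successive positions are treated as independent, and the synthesized strand is read as the sense strand.

```python
from itertools import product

# DNA-sense equivalents of the mRNA stop codons UAA, UAG and UGA
STOP_CODONS = {"TAA", "TAG", "TGA"}


def base_frequencies(concentrations_mM):
    """Convert nucleotide concentrations into incorporation frequencies."""
    total = sum(concentrations_mM.values())
    return {base: conc / total for base, conc in concentrations_mM.items()}


def stop_codon_probability(concentrations_mM):
    """Probability that a random triplet is a stop codon (independent bases)."""
    freq = base_frequencies(concentrations_mM)
    return sum(
        freq[a] * freq[b] * freq[c]
        for a, b, c in product("ACGT", repeat=3)
        if a + b + c in STOP_CODONS
    )


def mean_codons_before_stop(concentrations_mM):
    """Mean number of sense codons read before the first stop codon."""
    p = stop_codon_probability(concentrations_mM)
    return (1 - p) / p


EQUIMOLAR = {"A": 1.0, "C": 1.0, "G": 1.0, "T": 1.0}
LOW_DTTP = {"A": 1.0, "C": 1.0, "G": 1.0, "T": 0.3}   # concentrations of step b)

if __name__ == "__main__":
    for name, mix in (("equimolar", EQUIMOLAR), ("low dTTP", LOW_DTTP)):
        print(name, round(stop_codon_probability(mix), 4),
              round(mean_codons_before_stop(mix), 1))
```

In this simplified model the stop codon frequency falls from 3/64 per triplet to about 0.025, so open reading frames are on average almost twice as long. Real incorporation rates of TdT differ between nucleotides, so the figures are indicative only.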



c) Synthesis of the Second Strand of Stochastic DNA

Each molecule of vector possesses, at the end of
the preceding operation, two stochastic sequences
whose 3' ends are complementary. The mixture of
polymers is therefore incubated in conditions
favoring hybridization of the complementary
extremities (150 mM NaCl, 10 mM Tris-HCl, pH 7.6, 1
mM EDTA) at 65°C for 10 minutes, followed by
lowering the temperature to [...] at a rate of 3 to 4°C
per hour. The hybridized polymers are then
reacted with 60 units of the large fragment (Klenow)
of polymerase I, in the presence of the four
nucleotide triphosphates (200 mM) at 4°C for two
hours. This step accomplishes the synthesis of the
second strand from the 3' ends of the hybrid
polymers. The molecules which result from this
direct synthesis starting from linearized vector are
thereafter utilized to transform competent cells.

d) Transformation of Competent Clones

100 to 200 ml of competent HB101 or C600 cells, at
a concentration of [...] cells/ml, are incubated with
the stochastic DNA preparation in the presence of
6 mM [...] MgCl2 for 30 minutes at 0°C. A
temperature shock of 3 minutes at 37°C is then
applied, followed by the addition of 400 to [...] of
culture medium, without antibiotics. The
transformed culture is incubated [...] minutes, then
diluted to 10 litres by [...] medium containing [...].
After [...] the amplified culture is centrifuged, and
the pellet of transformed cells is lyophilized and
stored at -70°C. Such a culture contains 3 x 10^7 to
10^8 independent transformants, each containing a
unique stochastic gene inserted into the expression
vector.

II) Synthesis of Stochastic Genes Starting From
Oligonucleotides Without Cohesive Ends

This procedure is based on the fact that
polymerization of judiciously chosen palindromic
oligonucleotides permits construction of stochastic
genes which have no "stop" codon in any of the six
possible reading frames, while at the same time
assuring a balanced representation of triplets
specifying all amino acids. Further, and to avoid a
repetition of sequence motifs in the proteins which
result, the oligonucleotides can contain a number of
bases which is not a multiple of three. The example
which follows describes the use of one of the
possible combinations which fulfil these criteria:

a) Choice of a Group of Octamers

The following group of oligonucleotides:

5' GGAATTCC 3'
5' GGTCGACC 3'
5' CAAGCTTG 3'
5' CCATATGG 3'
5' CATCGATG 3'

is composed of 5 palindromes (thus self
complementary sequences) where it is easy to verify
that their stochastic polymerization does not
generate any "stop" codons, and specifies all the
amino acids.
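
The verification mentioned above can be carried out mechanically. The sketch below enumerates every triplet that can appear, on either strand and in any reading frame, in a polymer of the five octamers; since each octamer is longer than a triplet, it is sufficient to examine all ordered pairs of octamers. The function names are illustrative, and the standard genetic code is assumed.

```python
from itertools import product

OCTAMERS = ["GGAATTCC", "GGTCGACC", "CAAGCTTG", "CCATATGG", "CATCGATG"]

# Standard genetic code, bases ordered T, C, A, G at each codon position
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def reachable_triplets(oligos):
    """All triplets that can occur, on either strand, in any polymer."""
    triplets = set()
    for left, right in product(oligos, repeat=2):
        joined = left + right
        for strand in (joined, reverse_complement(joined)):
            triplets.update(strand[i:i + 3] for i in range(len(strand) - 2))
    return triplets


TRIPLETS = reachable_triplets(OCTAMERS)
STOPS_FOUND = {t for t in TRIPLETS if CODON_TABLE[t] == "*"}
AMINO_ACIDS = {CODON_TABLE[t] for t in TRIPLETS} - {"*"}

if __name__ == "__main__":
    print("all palindromic:", all(o == reverse_complement(o) for o in OCTAMERS))
    print("stop codons found:", sorted(STOPS_FOUND))
    print("amino acids specified:", len(AMINO_ACIDS))
```

Because the octamer length is not a multiple of three, successive copies of the same octamer are read in different frames, which is why all frames have to be included in the enumeration.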

Obviously, one can utilize other groups of
palindromic octamers which do not generate any
"stop" codons and specify all the amino acids found
in polypeptides. Clearly, it is also possible to utilize
non palindromic groups of octamers, or other
oligomers, under the condition that their
complements forming double stranded DNA are
also used.

b) Assembly of a Stochastic Gene From a Group of
Octamers

A mixture containing 5 µg each of the
oligonucleotides indicated above (previously
phosphorylated at the 5' position by a standard
procedure) is reacted in a 100 µl volume containing
1 mM ATP, 10% polyethylene glycol, and 100 units
of T4 DNA ligase in the appropriate buffer at 13°C for
six hours. This step carries out the stochastic
polymerization of the oligomers in the double
stranded state and without cohesive ends. The
resulting polymers are isolated by passage over a
molecular sieve (Biogel P60), recovering those with
20 to 100 oligomers. After concentration, this
fraction is again submitted to catalysis of
polymerization by T4 DNA ligase under the
conditions described above. Thereafter, as
described above, those polymers which have
assembled at least 100 oligomers are isolated.



c) Preparation of the Host Plasmid

The pUC8 expression vector is linearized by SmaI
enzyme in the appropriate buffer, as described
above. The vector linearized by SmaI does not have
cohesive ends. Thus the linearized vector is treated
by calf intestine alkaline phosphatase (CIP) at a level
of one unit per microgram of vector in the
appropriate buffer, at 37°C for 30 minutes. The CIP
enzyme is thereafter inactivated by two successive
extractions with phenol-chloroform. The linearized
and dephosphorylated vector is precipitated in
ethanol, then redissolved in water at 1 mg/ml.

d) Ligation of Stochastic Genes to the Vector

Equimolar quantities of vector and polymers are
mixed and incubated in the presence of 1000 units
of T4 DNA ligase, 1 mM ATP, 10% polyethylene
glycol, in the appropriate buffer, for 12 hours at
13°C. This step ligates the stochastic polymers in the
expression vector and forms double stranded
circular molecules which are, therefore, capable of
transforming.

Transformation of Competent Clones 
Transformation of competent clones is carried out
in the manner previously described.

III) Assembly of Stochastic Genes Starting From a
Group of Heptamers

This procedure differs from that just discussed in
that it utilizes palindromic heptamers which have
variable cohesive ends, in place of octamers without
cohesive ends. It yields stochastic sequences
containing a smaller number of identical motifs.

a) Choice of a Group of Heptamers

It is possible, as an example, to use the following
three palindromic heptamers:

5' XTCGCGA 3'
5' XCTGCAG 3'
5' RGGTACC 3'

where X = A, G, C or T and R = A or T, and where
polymerization cannot generate any "stop" codons
and forms triplets specifying all the amino acids.
Clearly it is possible to use other groups of
heptamers fulfilling these same conditions.
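
As for the octamers, the absence of "stop" codons can be checked by enumeration once the variable positions are expanded. In the sketch below, X and R carry the meanings given in the text (note that R here means A or T, which differs from the usual IUPAC convention); the top strand of a polymer is modelled as a plain concatenation of heptamers, which is an assumption about how the one-base cohesive ends pair.

```python
from itertools import product

# Variable positions as defined in the text: X = A, G, C or T; R = A or T
DEGENERATE = {"X": "ACGT", "R": "AT"}
HEPTAMER_PATTERNS = ["XTCGCGA", "XCTGCAG", "RGGTACC"]
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def expand(pattern):
    """List every concrete sequence described by a degenerate pattern."""
    choices = [DEGENERATE.get(symbol, symbol) for symbol in pattern]
    return ["".join(bases) for bases in product(*choices)]


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def stop_codons_in_polymers(oligos):
    """Stop codons reachable on either strand of any concatenation."""
    found = set()
    for left, right in product(oligos, repeat=2):
        joined = left + right
        for strand in (joined, reverse_complement(joined)):
            for i in range(len(strand) - 2):
                if strand[i:i + 3] in STOP_CODONS:
                    found.add(strand[i:i + 3])
    return found


HEPTAMERS = [seq for pattern in HEPTAMER_PATTERNS for seq in expand(pattern)]

if __name__ == "__main__":
    print(len(HEPTAMERS), "distinct heptamers")
    print("stop codons found:", sorted(stop_codons_in_polymers(HEPTAMERS)))
```

The three patterns expand to ten distinct heptamers, each consisting of a palindromic hexamer preceded by one variable base.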

b) Polymerization of a Group of Heptamers

This polymerization is carried out exactly in the
fashion described above for octamers.

c) Elimination of Cohesive Extremities

The polymers thus obtained have one unpaired
base on their two 5' extremities. Thus, it is
necessary to add the complementary base to the
corresponding 3' extremities. This is carried out as
follows: 10 micrograms of the double stranded
polymers are reacted with 10 units of the Klenow
enzyme, in the presence of the four
deoxynucleotide triphosphates (200 mM) in a volume
of 100 µl, at 4°C, for 60 minutes. The enzyme is
inactivated by phenol-chloroform extraction, and
the free nucleotides are removed by differential
precipitation. The polymers are then ligated to the
host plasmid (previously linearized and
dephosphorylated) by following the procedures
described above.

It is to be noted that the two last procedures which
were described utilize palindromic octamers or
heptamers which constitute specific sites of certain
restriction enzymes. These sites are absent, for the
most part, from the pUC8 expression vector. Thus, it
is possible to augment considerably the complexity
of an initial preparation of stochastic genes by
proceeding in the following way: the plasmid DNA
derived from the culture of 10^7 independent
transformants obtained by one of the two last
procedures described above is isolated. After this
DNA is purified, it is partially digested by ClaI
restriction enzyme (procedure II) or by the PstI
restriction enzyme (procedure III). After inactivation
of the enzyme, the partially digested DNA is treated
with T4 DNA ligase, which has the effect of creating a
very large number of new sequences, while
conserving the fundamental properties of the initial
sequences. This new ensemble of stochastic
sequences can then be used to transform competent
cells. In addition, the stochastic genes cloned by
procedures II and III can be excised intact from the
pUC8 expression vector by utilizing restriction sites
belonging to the cloning vector and not represented
in the stochastic DNA sequences.
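
The way in which digestion followed by religation creates new sequences while conserving the properties of the initial ones can be sketched for procedure II. The model below is deliberately simplified and is an illustration only: it follows the top strand, uses the ClaI recognition site ATCGAT (cut after AT), makes a single cut per molecule, and ignores fragment inversion and the joining of more than two fragments. The two example genes are hypothetical.

```python
from itertools import product

OCTAMERS = ["GGAATTCC", "GGTCGACC", "CAAGCTTG", "CCATATGG", "CATCGATG"]
CLAI_SITE = "ATCGAT"
CLAI_CUT_OFFSET = 2   # ClaI cuts the top strand as AT^CGAT


def cut_positions(seq, site=CLAI_SITE, offset=CLAI_CUT_OFFSET):
    """Positions at which the top strand is cut by the enzyme."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def recombine(genes):
    """Join every left arm with every right arm produced by a single cut."""
    lefts, rights = [], []
    for gene in genes:
        for pos in cut_positions(gene):
            lefts.append(gene[:pos])
            rights.append(gene[pos:])
    return {left + right for left, right in product(lefts, rights)}


def is_octamer_polymer(seq):
    """True if the sequence is an in-register concatenation of the octamers."""
    return len(seq) % 8 == 0 and all(
        seq[i:i + 8] in OCTAMERS for i in range(0, len(seq), 8)
    )


# Two hypothetical stochastic genes, each containing one ClaI octamer
GENES = [
    "GGAATTCC" + "CATCGATG" + "GGTCGACC",
    "CAAGCTTG" + "CATCGATG" + "CCATATGG",
]
RECOMBINANTS = recombine(GENES)
NOVEL = RECOMBINANTS - set(GENES)

if __name__ == "__main__":
    for seq in sorted(NOVEL):
        print(seq, is_octamer_polymer(seq))
```

Every religated molecule is again an in-register polymer of the original octamers, so the absence of "stop" codons is preserved, while combinations that were not present in the initial ensemble appear.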

Recombination within the stochastic genes
generated by the two procedures just described,
which results from the internal homology due to the
recurrent molecular motifs, is an important
additional method to achieve in vivo mutagenesis of
the coding sequences. This results in an
augmentation of the number of new genes which
can be examined.

Finally, for all the procedures to generate novel
synthetic genes, it is possible to use a number of
common techniques to modify genes in vivo or in
vitro, such as a change of reading frame, inversion
of sequences with respect to their promoter, point
mutations, or utilization of host cells expressing one
or several suppressor tRNAs.
In considering the above description, it is clear
that it is possible to construct, in vitro, an extremely
large number (for example greater than a billion) of
different genes, by enzymatic polymerization of
nucleotides or of oligonucleotides. This
polymerization is carried out in a stochastic manner,
as determined by the respective concentrations of
the nucleotides or oligonucleotides present in the
reaction mixture.
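
The order of magnitude involved can be made concrete with a simple count. The sketch below gives an upper bound only: it counts ordered arrangements of oligomers and ignores the fact that a duplex read from its other strand corresponds to the reversed arrangement. The figures of 5 kinds of oligomer and 100 oligomers per polymer are taken from procedure II.

```python
import math


def sequence_space(n_kinds, n_units):
    """Number of ordered polymers of n_units oligomers drawn from n_kinds."""
    return n_kinds ** n_units


def min_units_exceeding(n_kinds, target):
    """Smallest polymer length whose sequence space exceeds the target."""
    n_units = 1
    while sequence_space(n_kinds, n_units) <= target:
        n_units += 1
    return n_units


if __name__ == "__main__":
    billion = 10 ** 9
    print("oligomers needed to exceed a billion:", min_units_exceeding(5, billion))
    print("log10 of sequence space for 100 octamers:",
          round(100 * math.log10(5), 1))
```

With five kinds of octamer, polymers of only 13 oligomers already exceed a billion possible arrangements, so the number of distinct genes actually obtained is limited by the number of independent transformants rather than by the sequence space.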

As indicated above, two methods can be utilized
to clone such genes (or coding sequences): the
polymerization can be carried out directly on a
cloning expression vector, which was previously
linearized, or it is possible to proceed sequentially
to the polymerization then the ligation of the
polymers to the expression vector.



Clearly, in addition to the procedures which were
described above, it is feasible to use all other
methods which are appropriate for the synthesis of
stochastic sequences. In particular, it is possible to
carry out polymerization, by biochemical means, of
single stranded oligomers of DNA or RNA obtained
by chemical synthesis, then treat these segments of
DNA or RNA by established procedures to generate
double stranded DNA (cDNA) in order to clone such
genes.

Screening or Selection of Clones of Transformed
Host Cells

The further step of the procedure according to the
invention consists in examining the transformed or
transfected cells, by selection or screening, in order
to isolate one or several cells whose transforming or
transfecting DNA leads to the synthesis of a
transcription product (RNA) or translation product
(protein) having a desired property. These properties
can be, for example, enzymatic, functional, or
structural.

One of the most important aspects of the process,
according to the invention, is that it permits the
simultaneous screening or selection of an
exploitable product (RNA or protein) and of the gene
which produces this product. In addition, the DNA
synthesized and cloned as described can be
selected or screened in order to isolate sequences of
DNA constituting products in themselves, having
exploitable biochemical properties.
We shall now describe, as non-limiting
examples, preferred procedures for screening or
selection of clones of transformed cells such that the
novel proteins are of interest from the point of view
of industrial or medical applications.

One of these procedures rests on the idea of
producing, or obtaining, polyclonal or monoclonal
antibodies, by established techniques, directed
against a protein or another type of molecule of
biochemical or medical interest, where that
molecule is, or has been rendered, immunogenic,
and thereafter using these antibodies as probes to
identify, among the very large number of clones
transformed by stochastic genes, those whose
proteins react with these antibodies. This reaction is
a result of a structural homology which exists
between the polypeptide synthesized by the
stochastic gene and the initial molecule. It is
possible in this way to isolate numbers of novel
proteins which behave as epitopes or antigenic
determinants on the initial molecule. Such novel
proteins are liable to simulate, stimulate, modulate,
or block the effect of the initial molecule. It will be
clear that this means of selection or screening may
itself have very many pharmacologic and
biochemical applications. Below we describe, as a
non-limiting example, this first mode of operation in
a concrete case:

EGF (epidermal growth factor) is a protein
present in the blood, whose role is to stimulate the
growth of epithelial cells. This effect is obtained by
the interaction of EGF with a specific receptor
situated in the membrane of epithelial cells.
Antibodies directed against EGF are prepared by
injecting animals with EGF coupled to KLH (keyhole
limpet hemocyanin) to augment the
immunogenicity of the EGF. The anti-EGF
antibodies of the immunized animals are purified,
for example, by passage over an affinity column,
where the ligand is EGF or a synthetic peptide
corresponding to a fragment of EGF. The purified
anti-EGF antibodies are then used as probes to
screen a large number of bacterial clones lysed by
chloroform, and on a solid support. The anti-EGF
antibodies bind those stochastic peptides or
proteins whose epitopes resemble those of the
initial antigen. The clones containing such peptides
or proteins are shown by autoradiography after
incubation of the solid support with radioactive
protein A, or after incubation with a radioactive anti-
antibody antibody.

These steps identify those clones, each of which
contains one protein (and its gene) reacting with the
screening antibody. It is feasible to screen among a
very large number of colonies of bacterial cells or
viral plaques (for example, on the order of
1,000,000) and it is feasible to detect extremely
small quantities, on the order of 1 nanogram, of
protein product. Thereafter, the identified clones are
cultured and the proteins so detected are purified in
conventional ways. These proteins are tested in
vitro in cultures of epithelial cells to determine if
they inhibit, simulate, or modulate the effects of EGF
on these cultures. Among the proteins so obtained,
some may be utilized for the chemotherapeutic
treatment of epitheliomas. The activities of the
proteins thus obtained can be improved by
mutation of the DNA coding for the proteins, in
ways analogous to those described above. A variant
of this procedure consists in purifying these
stochastic peptides, polypeptides or proteins, which
can be used as vaccines or more generally to confer
an immunity against a pathogenic agent or to
exercise other effects on the immunological system,
for example, to create a tolerance or diminish
hypersensitivity with respect to a given antigen, in
particular due to binding of these peptides,
polypeptides or proteins with the antibodies
directed against this antigen. It is clear that it is
possible to use such peptides, polypeptides or
proteins in vitro as well as in vivo.
More precisely, in the ensemble of novel proteins
which react with the antibodies against a given
antigen X, each has at least one epitope in common
with X, thus the ensemble has an ensemble of
epitopes in common with X. This permits utilization
of the ensemble or sub-ensemble as a vaccine to
confer immunity against X. It is, for example, easy to
purify one or several of the capsid proteins of the
hepatitis B virus. These proteins can then be
injected into an animal, for example a rabbit, and
the antibodies corresponding to the initial antigen
can be recovered by affinity column purification.
These antibodies may be used, as described above,
to identify clones producing at least one protein
having an epitope resembling at least one of the
epitopes of the initial antigen. After purification,
these proteins are used as antigens (either alone or
[...]) [...] protection against hepatitis B. The final
production of the vaccine does not require further
access to the initial pathogenic agent.

Note that, during the description of the
procedures above, a number of means to achieve
selection or screening have been described. All
these procedures may require the purification of a
particular protein from a transformed clone. These
protein purifications can be carried out by
established procedures and utilize, in particular, the
techniques of gel chromatography, ion exchange
chromatography, and affinity chromatography. In
addition, the proteins generated by the stochastic
genes can have been cloned in the form of hybrid
proteins having, for example, a sequence of the
β-galactosidase enzyme which permits affinity
chromatography against anti-β-galactosidase
antibodies, and allows the subsequent cleavage of
the hybrid part (that is to say, allowing separation of
the novel part and the bacterial part of the hybrid
protein). Below we describe the principles and
procedures for selection of peptides or polypeptides
and the corresponding genes, according to a second
method of screening or selection based on the
detection of the capacity of these peptides or
polypeptides to catalyse a specific reaction.

As a concrete and non limiting example,
screening or selection in the particular case of
proteins capable of catalysing the cleavage of
lactose, normally a function fulfilled by the enzyme
β-galactosidase (β-gal), will be described.

As described above, the first step of the process
consists in generating a very large ensemble of
expression vectors, each expressing a distinct novel
protein. To be concrete, one may, for example,
choose the pUC3 expression vector with cloning of
stochastic sequences of DNA in the PstI restriction
site. The plasmids thus obtained are then
introduced in a clone of E. coli from whose genome
the natural gene for β-galactosidase, Z, and a
second gene, EBG, unrelated to the first but able to
mutate towards β-gal function, have both been
eliminated by known genetic methods. Such host
cells (Z-, EBG-) are not able by themselves to
catalyse lactose hydrolysis, and as a consequence to
use lactose as carbon source for growth. This
permits utilization of such host clones for screening
or selection for β-gal function.
A convenient biological assay to analyse
transformed E. coli clones for those which have
novel genes expressing a β-gal function consists in
the culture of bacteria transformed as described in
Petri dishes containing X-gal in the medium. In this
case, all bacterial colonies expressing a β-gal
function are visualized as blue colonies. By using
such a biological assay, it is possible to detect even
weak catalytic activity. The specific activity of
characteristic enzymes ranges from 10 to 10,000
product molecules per second.
Supposing that a protein synthesized by a
stochastic gene has a weak specific activity, on the
order of one molecule per 100 seconds, it remains
possible to detect such catalytic activity. In a Petri
dish containing X-gal in the medium, and in the
presence of IPTG
(isopropyl-D-thiogalactoside), visualisation of a blue
region requires cleavage of about 10¹⁰ to 10¹¹
molecules of X-gal per square millimeter. A
bacterial colony expressing a weak enzyme and
occupying a surface area of 1 mm² has about 10⁷ to
10⁸ cells. If each cell has only one copy of the weak
enzyme, each cell would need to catalyse cleavage
of between 100 and 10,000 molecules of X-gal to be
detected, which would require between 2.7 and 270
hours. Since under selective conditions it is possible
to amplify the number of copies of each plasmid per
cell to 5 to 20 copies per cell, or even to 100 to 1000,
and because up to 10% of the protein of the cell can
be specified by the new gene, the duration needed
to detect a blue colony in the case of 100 enzyme
molecules of weak activity per cell is on the order of
0.27 to 2.7 hours.
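
The detection-time estimate above can be retraced with a few lines of arithmetic. The following is only a sketch: the function name is hypothetical, and the molecule and cell counts are assumed bounds chosen so that each cell must perform between 100 and 10,000 cleavages, as the text states. With one enzyme per cell it gives roughly 2.8 to 278 hours, in line with the 2.7 to 270 hours quoted. With 100 enzymes per cell the upper bound again comes out near 2.8 hours, while the lower bound evaluates to about 0.03 hours rather than the printed 0.27.

```python
def detection_hours(molecules_needed, cells_per_colony,
                    enzymes_per_cell, seconds_per_cleavage):
    """Hours for a 1 mm^2 colony to cleave enough X-gal to appear blue."""
    # Cleavage events each cell must contribute.
    per_cell = molecules_needed / cells_per_colony
    # Cleavage events each enzyme molecule must perform.
    per_enzyme = per_cell / enzymes_per_cell
    return per_enzyme * seconds_per_cleavage / 3600.0


# Assumed bounds (not taken verbatim from the text): picked so that the
# per-cell requirement spans 100 to 10,000 cleavages.
EASY = (1e10, 1e8)   # fewest molecules needed, most cells per colony
HARD = (1e11, 1e7)   # most molecules needed, fewest cells per colony

for copies in (1, 100):
    fast = detection_hours(*EASY, copies, 100)
    slow = detection_hours(*HARD, copies, 100)
    print(f"{copies:>3} enzyme(s) per cell: {fast:.3f} to {slow:.1f} hours")
```

The calculation is linear in every parameter, so a tenfold increase in plasmid copy number or in enzyme turnover shortens the waiting time tenfold.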

As a consequence of these facts, screening a very
large number of independent bacterial colonies,
each expressing a different novel gene, and using
the capacity to express a β-gal function as the
selection criterion, is fully feasible. It is possible to
carry out screening of about 2000 colonies in one
Petri dish of 10 cm diameter. Thus, about 20 million
colonies can be screened on a sheet of X-gal agar of
1 square meter.

It is to be noted that bacterial colonies which
appear blue on X-gal Petri dishes might be false
positives due to a mutation in the bacterial genome
which confers on it the capacity to metabolize
lactose, or for other reasons than those which result
from a catalytic activity of the novel protein
expressed by the cells of the colony. Such false
positives can be directly eliminated by purifying the
DNA of the expression vector from the positive
colony, and retransforming Z-, EBG- E. coli host
cells. If the β-gal activity is due to the novel protein
coded by the new gene in the expression vector, all
those cells transformed by that vector will exhibit
β-gal function. In contrast, if the initial blue colony is
due to a mutation in the genome of the host cell, it is
a rare event and independent of the transformation,
thus the number of cells of the new clone of
transformed E. coli capable of expressing β-gal
function will be small or zero.

The power of mass simultaneous purification of
all the expression vectors from all the positive
clones (blue) followed by retransformation of naive
bacteria should be stressed. Suppose that the aim is
to carry out a screening to select proteins having a new
catalytic function, and that the probability that a new
peptide or polypeptide carries out this function at
least weakly is 10⁻⁶, while the probability that a
clone of the E. coli bacterial host undergoes a
mutation rendering it capable of carrying out the
same function is 10⁻⁵; then it can be calculated that
among 20 million transformed bacteria which are
screened, 20 positive clones will be attributable to
the novel genes in the expression vectors which each
carries, while 200 positive clones will be the result of
genomic mutation. Mass purification of the
expression vectors from the total of 220 positive
bacterial clones, followed by retransformation of
naive bacteria with the mixture of expression
vectors, will produce a large number of positive
clones consisting of all those bacteria transformed
with the 20 expression vectors which code for the
novel proteins having the desired function, and a
very small number of bacterial clones resulting from
genomic mutations and containing the 200
expression vectors which are not of interest. A small
number of cycles of purification of expression
vectors from positive bacterial colonies, followed by
such retransformation, allows the detection of very
rare expression vectors truly positive for a desired
catalytic activity despite a high background rate of
mutations in the host cells for the same function.
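
The enrichment argument above can be checked numerically. This is a minimal sketch with hypothetical function names, assuming probabilities of 10⁻⁶ (active novel gene) and 10⁻⁵ (host mutation), which are the values implied by 20 and 200 positives among 20 million screened bacteria. It further assumes, for illustration only, that every recovered vector transforms the same number of naive cells.

```python
def expected_positives(n_screened, p_novel_gene, p_host_mutation):
    """Expected (true, false) positive colonies in the primary screen."""
    return n_screened * p_novel_gene, n_screened * p_host_mutation


def true_fraction_after_retransformation(n_true, n_false, p_host_mutation):
    """Share of positive colonies that carry a genuinely active vector.

    Cells receiving an active vector always score positive; cells
    receiving an inactive vector score positive only if their own
    genome happens to mutate.
    """
    true_signal = n_true
    false_signal = n_false * p_host_mutation
    return true_signal / (true_signal + false_signal)


true_pos, false_pos = expected_positives(20_000_000, 1e-6, 1e-5)
before = true_pos / (true_pos + false_pos)
after = true_fraction_after_retransformation(20, 200, 1e-5)
print(f"true positives before: {before:.3f}, after one cycle: {after:.4f}")
```

Under these assumptions genuine vectors make up about one positive in eleven after the primary screen, and nearly all positives after a single round of retransformation.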

Following screening operations of this type, it is
possible to purify the new protein by established
techniques. The production of that protein in large
quantity is made possible by the fact that
identification of the useful protein occurs together
with simultaneous identification of the gene coding
for the same protein. Consequently, either the same
expression vector can be used, or the novel gene
can be transplanted into a more appropriate
expression vector for its synthesis and isolation in
large quantity.

It is feasible to apply this method of screening for
any enzymatic function for which an appropriate
biological assay exists. For such screenings, it is not
necessary that the enzymatic function which is
sought be useful to the host cell. It is possible to
carry out screenings not only for an enzymatic
function but for any other desired property for
which it is possible to establish an appropriate
biological assay. It is thus feasible to carry out, even
in the simple case of β-gal function visualized on an
X-gal Petri plate, a screening of on the order of 100
million, or even a billion novel genes for a catalytic
activity or any other desired property.
Selection Of Transformed Host Cells
On the other hand, it is possible to use selection
techniques for any property, catalytic or otherwise,
where the presence or absence of the property can
be rendered essential for the survival of the host
cells containing the expression vectors which code
for the novel genes, or also can be used to select for
those viruses coding and expressing the desired
novel gene. As a non-limiting, but concrete
example, selection for β-galactosidase function
shall be described. An appropriate clone of Z-, EBG-
E. coli is not able to grow on lactose as the sole
carbon source. Thus, after carrying out the first step
described above, it is possible to culture a very large
number of host cells transformed by the expression
vectors coding for the novel genes, under selective
conditions, either by progressive diminution of
other sources of carbon, or utilization of lactose
alone from the start. During the course of such
selection, in vivo mutagenesis by recombination, or
by explicitly recovering the expression vectors and
mutagenizing their novel genes in vitro by various
mutagens, or by any other common technique,
permits adaptive improvements in the capacity to
fulfil the desired catalytic function. When both
selection techniques and convenient bioassay
techniques exist at the same time, as in the present
case, it is possible to use selection techniques
initially to enrich the representation of host bacteria
expressing the β-gal function, then carry out a
screening on a Petri plate on X-gal medium to
establish efficiently which are the positive cells. In
the absence of convenient bioassays, application of
progressively stricter selection is the easiest route to
purify one or a small number of distinct host cells
whose expression vectors code for the proteins
catalysing the desired reaction.

It is possible to utilize these techniques to find
novel proteins having a large variety of structural
and functional characteristics beyond the capacity
to catalyse a specific reaction. For example, it is
possible to carry out a screen or select for novel
proteins which bind to cis-regulatory sites on the
DNA and thereby block the expression of one of the
host cell's functions, or block transcription of the
DNA, stimulate transcription, etc.

For example, in the case of E. coli, a clone mutant
in the repressor of the lactose operon (i-) expresses
β-gal function constitutively due to the fact that the
lactose operator is not repressed. All cells of this
type produce blue clones on Petri plates containing
X-gal medium. It is possible to transform such host
cells with expression vectors synthesizing novel
proteins and carry out a screen on X-gal Petri plates
in order to detect those clones which are not blue.
Among those, some represent the case where the
new protein binds to the lactose operator and
represses the synthesis of β-gal. It is then feasible to
mass isolate such plasmids, retransform, isolate
those clones which do not produce β-gal, and
thereafter carry out a detailed verification.
As mentioned above, the process can be utilized
in order to create and then isolate not only exploitable
proteins but also DNA and RNA as products in
themselves, having exploitable properties. This
results from the fact that, on one hand, the
procedure consists in creating stochastic sequences
of DNA which may interact directly with other
cellular or biochemical constituents, and on the
other hand, these sequences cloned in expression
vectors are transcribed into RNA which are
themselves capable of multiple biochemical
interactions.

An Example Of The Use Of The Procedure To Create
And Select For A DNA Which Is Useful In Itself
This example illustrates selection for a useful
DNA, and the purification and study of the
mechanism of action of regulatory proteins which
bind to the DNA. Consider a preparation of the
oestradiol receptor, a protein obtained by standard
techniques. In the presence of oestradiol, a steroid
sexual hormone, the receptor changes
conformation and binds tightly to certain specific
sequences in the genomic DNA, thus affecting the
transcription of genes implicated in sexual
differentiation and the control of fertility. By
incubating a mixture containing oestradiol, its
receptor, and a large number of different stochastic
DNA sequences inserted in their vectors, followed
by filtration of the mixture across a nitrocellulose
membrane, one has a direct selection for those
stochastic DNA sequences binding to the
oestrogen-receptor complex, where only those DNAs
bound to a protein are retained by the membrane.
After washing and elution, the DNA liberated from the
membrane is utilized as such to transform bacteria.
After culture of the transformed bacteria, the vectors
which they contain are again purified and one or
several cycles of incubation, filtration and
transformation are carried out as described above.
These procedures allow the isolation of stochastic
sequences of DNA having an elevated affinity for the
oestradiol-receptor complex. Such sequences are
open to numerous diagnostic and pharmacologic
applications, in particular, for developing synthetic
oestrogens for the control of fertility and treatment
of sterility.

Creation And Selection Of An RNA Useful In Itself

Let there be a large number of stochastic DNA
sequences, produced as has been described and
cloned in an expression vector. It follows that the
RNA transcribed from these sequences in the
transformed host cells can be useful products
themselves. As a non limiting example, it is possible
to select a stochastic gene coding for a suppressor
transfer RNA (tRNA) by the following procedure: A
large number (≥ 10⁸) of stochastic sequences are
transformed into competent bacterial hosts carrying
a "nonsense" mutation in the arg E gene. These
transformed bacteria are plated on minimal medium
without arginine and with the selective antibiotic for
that plasmid (ampicillin if the vector is pUC3). Only
those transformed bacteria which have become
capable of synthesizing arginine will be able to
grow. This phenotype can result either from a back
mutation of the host genome, or the presence in the
cell of a suppressor. It is easy to test each
transformed colony to determine if the Arg+
phenotype is or is not due to the presence of the
stochastic gene in its vector; it suffices to purify the
plasmid from this colony and verify that it confers
an Arg+ phenotype on all arg E cells transformed
by it.

Selection Of Proteins Capable Of Catalysing A
Sequence Of Reactions

Below we describe another means of selection,
open to independent applications, based on the
principle of simultaneous and parallel selection of a
certain number of novel proteins capable of
catalysing a connected sequence of reactions.

The basic idea of this method is the following:
given an initial ensemble of chemical compounds
considered as building blocks or elements of
construction from which it is hoped to synthesize
one or several desired chemical compounds by
means of a catalysed sequence of chemical
reactions, there exists a very large number of
reaction routes which can be partially or completely
substituted for one another, which are all
thermodynamically possible, and which lead from
the set of building blocks to the desired target
compound(s). Efficient synthesis of a target
compound is favored if each step of at least one
reaction pathway leading from the building block
compounds to the target compound is composed of
reactions each of which is catalysed. On the other
hand, it is relatively less important to determine
which among the many independent or partially
independent reaction pathways is catalysed. In the
previous description, we have shown how it is
possible to obtain a very large number of host cells
each of which expresses a distinct novel protein.

Each of these novel proteins is a candidate to
catalyse one or another of the possible reactions, in
the set of all the possible reactions leading from the
ensemble of building blocks to the target
compound. If a sufficiently large number of
stochastic proteins is present in a reaction mixture
containing the building block compounds, such that
a sufficiently large number of the possible reactions
are catalysed, there is a high probability that one
connected sequence of reactions leading from the
set of building block compounds to the target
compound will be catalysed by a subset of the novel
proteins. It is clear that this procedure can be
extended to the catalysis not only of one, but of
several target compounds simultaneously.

Based on this principle it is possible to proceed as
follows in order to select in parallel a set of novel
proteins catalysing a desired sequence of chemical
reactions:

1. Specify the desired set of compounds
constituting the building blocks, utilizing
preferentially a reasonably large number of distinct
chemical species in order to increase the number of
potential concurrent reactions leading to the desired
target compound.

2. Using an appropriate volume of reaction
medium, add a very large number of novel
stochastic proteins isolated from transformed or
transferred cells synthesizing these proteins. Carry
out an assay to determine if the target compound is
formed. If it is, confirm that this formation requires
the presence of the mixture of novel proteins. If so,
then the mixture should contain a subset of proteins
catalysing one or several reaction pathways leading
from the building block set to the target compound.

3. Purify and divide the initial ensemble of clones
which synthesize the set of novel stochastic
proteins, to isolate the subset which is required to
catalyse the sequence of reactions leading to the
target compound.

More precisely, as a non limiting example, below
we describe selection of novel proteins capable of
catalysing the synthesis of a specific small peptide,
in particular a pentapeptide, starting from a
building block set constituted of smaller peptides
and amino acids. All peptides are constituted by a
linear sequence of 20 different types of amino acids,
oriented from the amino to the carboxy terminus.
Any peptide can be formed in a single step by the
terminal condensation of two smaller peptides (or of
two amino acids), or by hydrolysis of a larger
peptide. A peptide with M residues can thus be
formed by M-1 condensation reactions. The number
of reactions, A, by which a set of peptides having
length 1, 2, 3, ... M residues can be interconverted
is larger than the number of possible molecular
species, T [...]. Thus, a
large number of independent or partially
independent reaction pathways lead to the
synthesis of a specific target peptide. Choose a
pentapeptide whose presence can be determined
conveniently by some common assay technique, for
example HPLC (high pressure liquid
chromatography), paper chromatography, etc.
Formation of a peptide bond requires energy in a
dilute aqueous medium, but if the peptides
participating in the condensation reactions are
adequately concentrated, formation of peptide
bonds is thermodynamically favored over
hydrolysis and occurs efficiently in the presence of
an appropriate enzymatic catalyst, for example
pepsin or trypsin, without requiring the presence of
ATP or other high energy compounds. Such a
reaction mixture of small peptides, whose amino
acids are marked radioactively with ³H, ¹⁴C or ³⁵S to
act as tracers, constituting the building block set, can
be used at sufficiently high concentrations to lead to
condensation reactions.
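
The combinatorics behind this paragraph can be made concrete. The sketch below uses hypothetical function names and an arbitrary example pentapeptide written in one-letter amino acid code. It lists the M-1 ways a peptide can arise from a single terminal condensation of two shorter fragments, and counts how many distinct linear sequences of up to M residues exist over 20 amino acids.

```python
def single_step_condensations(peptide):
    """All (left, right) fragment pairs whose end-to-end condensation
    yields `peptide` in one step."""
    return [(peptide[:i], peptide[i:]) for i in range(1, len(peptide))]


def count_sequences_up_to(max_len, alphabet_size=20):
    """Number of distinct linear sequences of length 1..max_len."""
    return sum(alphabet_size ** n for n in range(1, max_len + 1))


target = "GAVLK"  # arbitrary illustrative pentapeptide, not from the text
for left, right in single_step_condensations(target):
    print(f"{left} + {right} -> {target}")
print(count_sequences_up_to(5), "possible peptides of length 1 to 5")
```

Each fragment in a pair can itself be formed by further condensations or by hydrolysis of a longer peptide, which is why many partially independent routes converge on the same target.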
For example, it is feasible to proceed as follows: 

15 mg of each amino acid and small peptide having
2 to 4 amino acids, chosen to constitute the building
block set, are dissolved in a volume of 0.25 ml to 1.0
ml of a 0.1 M pH 7.6 phosphate buffer. A large
number of novel proteins, generated and isolated as
described above, are purified from their bacterial or
other host cells. The mixture of these novel proteins
is dissolved to a final concentration on the order of
0.8 to 1.0 mg/ml in the same buffer. 0.25 ml to 0.5 ml
of the protein mixture is added to the mixture of
building blocks. This is incubated at 25°C to 40°C for
1 to 40 hours. Aliquots of 8 µl are removed at regular
intervals; the first is used as a "blank" and taken
before addition of the mixture of novel proteins.
These aliquots are analysed by chromatography
using n-butanol-acetic acid-pyridine-water
(30:6:20:24 by volume) as the solvent. The
chromatogram is dried and analysed by ninhydrin
or autoradiography (with or without intensifying
screens). Because the compounds constituting the
building block set are radioactively marked, the
target compound will be radioactive and it will have
specific activity high enough to permit detection at
the level of 1-10 ng. In place of standard
chromatographic analysis, it is possible to use HPLC
(high pressure liquid chromatography) which is
faster and simpler to carry out. More generally, all
the usual analytic procedures can be employed.
Consequently it is possible to detect a yield of the
target compound of less than one part per million by
weight compared to the compounds used as initial
building blocks.
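
The sensitivity claim can be illustrated with a short calculation. This is a rough sketch only. The number of building block compounds (20), the total reaction volume (1.0 ml) and the function name are assumptions made for illustration, while the 15 mg per compound, the 8 µl aliquot and the 1-10 ng detection level come from the protocol. Under these assumptions a 1 ng detection level corresponds to roughly 0.4 parts per million of the building blocks present in one aliquot.

```python
def detection_limit_ppm(detect_ng, n_compounds, mg_each=15.0,
                        reaction_ml=1.0, aliquot_ul=8.0):
    """Smallest detectable product, in ppm by weight of the building
    blocks contained in one chromatographed aliquot."""
    total_ng = n_compounds * mg_each * 1e6            # 1 mg = 1e6 ng
    aliquot_fraction = aliquot_ul / (reaction_ml * 1000.0)
    blocks_in_aliquot_ng = total_ng * aliquot_fraction
    return detect_ng / blocks_in_aliquot_ng * 1e6


for ng in (1, 10):
    ppm = detection_limit_ppm(ng, n_compounds=20)  # 20 compounds is assumed
    print(f"{ng} ng detectable -> {ppm:.2f} ppm of building blocks")
```

The limit scales inversely with the amount of building block material loaded, so a larger building block set or a larger aliquot lowers the detectable yield further.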

If the pentapeptide is formed in the conditions
described above, but not when an extract is utilized
which is derived from host cells transformed by an
expression vector containing no stochastic genes,
the formation of the pentapeptide is not the result of
bacterial contaminants and thus requires the
presence of a subset of the novel proteins in the
reaction mixture.

The following step consists in the separation of
the particular subset of cells which contain the
expression vectors coding for the novel proteins
catalysing the sequence of reactions leading to the
target pentapeptide. As an example, if the number
of reactions forming this sequence is 5, there are
about 5 novel proteins which catalyse the necessary
reactions. If the clone bank of bacteria containing
the expression vectors which code for the novel
genes has a number of distinct novel genes which is
on the order of 1,000,000, all these expression
vectors are isolated en masse and retransformed
into 100 distinct sets of 10⁸ bacteria at a ratio of
vectors to bacteria which is sufficiently low that, on
average, the number of bacteria in each set which
are transformed is about half the number of initial
genes, i.e. about 500,000. Thus, the probability that
any given one of the 100 sets of bacteria contains
the entire set of 5 critical novel proteins is
(1/2)⁵ = 1/32. Among the 100 initial sets of bacteria,
about 3 will contain the 5 critical transformants. In
each of these sets, the total number of new genes is
only 500,000 rather than 1,000,000. By successive
repetitions, the total number of which is about 20 in
the present case, this procedure isolates the 5
critical novel genes. Following this, mutagenesis
and selection on this set of 5 stochastic genes allows
improvement of the necessary catalytic functions. In
a case where it is necessary to catalyse a sequence
of 20 reactions and 20 genes coding novel proteins
need to be isolated in parallel, it suffices to adjust
the multiplicity of transformation such that each set
of 10⁸ bacteria receives 80% of the 10⁶ stochastic
genes, and to use 200 such sets of bacteria. The
probability that all 20 novel proteins are found in
one set is 0.8²⁰ ≈ 0.015. Thus, about 2 among the 200
sets will have the 20 novel genes which are needed
to catalyse the formation of the target compound.
The number of cycles required for isolation of the 20
novel genes is on the order of 30.
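
The subdivision scheme lends itself to a quick numerical check. The sketch below uses hypothetical function names and assumes each required gene is retained in a sub-library independently of the others. For the five-gene case it gives a probability of 1/32, about 3 complete sets out of 100, and about 20 halving cycles to reduce a library of 1,000,000 genes to a single candidate. For the twenty-gene case, 0.8²⁰ evaluates to roughly 0.012, or about 2 complete sets out of 200.

```python
import math


def p_set_complete(k_genes, kept_fraction):
    """Chance that one sub-library still holds all k required genes,
    assuming each gene is retained independently."""
    return kept_fraction ** k_genes


def expected_complete_sets(n_sets, k_genes, kept_fraction):
    """Expected number of sub-libraries that hold every required gene."""
    return n_sets * p_set_complete(k_genes, kept_fraction)


def cycles_to_shrink(library_size, target_size, kept_fraction):
    """Rounds of subdivision needed to go from library_size down to
    target_size when each round keeps kept_fraction of the genes."""
    return math.ceil(math.log(library_size / target_size)
                     / math.log(1.0 / kept_fraction))


print(p_set_complete(5, 0.5))
print(expected_complete_sets(100, 5, 0.5))
print(cycles_to_shrink(1_000_000, 1, 0.5))
print(round(expected_complete_sets(200, 20, 0.8), 2))
```

Keeping a larger fraction per round raises the chance that a set stays complete but slows the shrinkage of the library, which is the trade-off behind choosing 50% for 5 genes and 80% for 20 genes.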

The principles and procedures described above
generalize from the use of peptides to numerous
areas of chemistry in which chemical reactions take
place in aqueous medium, in temperature, pH, and
concentration conditions which permit general
enzymatic function. In each case it is necessary to
make use of an assay method to detect the
formation of the desired target compound(s). It is
also necessary to choose a sufficiently large number
of building block compounds to augment the
number of reaction sequences which lead to the
target compound.
The concrete example which was given for the
synthesis of a target pentapeptide can also be
generalized as follows:

The procedure as described generates, among
other products, stochastic peptides and proteins.
These peptides or proteins can act, catalytically or in
other ways, on other compounds. They can equally
constitute the substrates on which they act. Thus, it
is possible to select (or screen) for the capacity of
such stochastic peptides or proteins to interact
among themselves and thereby modify the
conformation, the structure or the function of some
among them. Similarly, it is possible to select (or
screen) for the capacity of these peptides and
proteins to catalyse among themselves hydrolysis
or other reactions
modifying the peptides. For example, the hydrolysis
of a given stochastic peptide by at least one member
of the set of stochastic peptides and proteins can be
followed and measured by radioactive marking of
the given protein followed by an incubation with a
mixture of the stochastic proteins in the presence of
ions such as Mg, Ca, Zn, Fe and ATP or GTP. The
appearance of radioactive fragments of the marked
protein is then measured as described. The
stochastic protein(s) which catalyse this reaction
can again be isolated, along with the gene(s)
producing them, by sequential diminution of the
library of transformed clones, as described above.
An extension of the procedure consists in the
selection of an ensemble of stochastic peptides and
polypeptides capable of catalysing a set of reactions
leading from the initial building blocks (amino acids
and small peptides) to some of the peptides or
polypeptides of the set. It is therefore also possible
to select an ensemble capable of catalysing its own
synthesis; such a reflexively autocatalytic set can be
established in a chemostat where the products of
the reactions are constantly diluted, but where the
concentration of the building blocks is maintained
constant. Alternatively, synthesis of such a set is
aided by enclosing the complex set of peptides in
liposomes by standard techniques. In a hypertonic
aqueous environment surrounding such liposomes,
a condensation reaction forming larger peptides
lowers the osmotic pressure inside the liposomes,
drives water molecules produced by the
condensation reactions out of the liposomes, and hence
favors synthesis of larger polymers. Existence of
such an autocatalytic ensemble can be verified by
two dimensional gel electrophoresis and by HPLC,
showing the synthesis of a stable distribution of
peptides and polypeptides. The appropriate reaction
volume depends on the number of molecular
species used, and the concentrations necessary to
favor the formation of peptide bonds over their
hydrolysis. The distribution of molecular species of
an autocatalytic ensemble is free to vary or change
due to the emergence of variant autocatalytic
ensembles. The peptides and polypeptides which
constitute an autocatalytic set may have certain
elements in common with the large initial ensemble
(constituted of coded peptides and polypeptides as
given by our procedure) but can also contain
peptides and polypeptides which are not coded by
the ensemble of stochastic genes coding for the
initial ensemble.

The set of stochastic genes whose products are
necessary to establish such an autocatalytic set can
be isolated as has been described, by sequential
diminution of the library of transformed clones. In
addition, an autocatalytic set can contain coded
peptides initially coded by the stochastic genes and
synthesized continuously in the autocatalytic set. To
isolate this coded subset of peptides and proteins,
the autocatalytic set can be used to obtain, through
immunization in an animal, polyclonal sera
recognizing a very large number of constituents of
the autocatalytic set.

These sera can be utilized to screen the library of
stochastic genes to find those genes expressing
proteins able to combine with the antibodies
present in the sera.

This set of stochastic genes expresses a large
number of coded stochastic proteins which persist
in the autocatalytic set. The remainder of the coded
constituents of such an autocatalytic set can be
isolated by serial diminution of the library of
stochastic genes, from which the subset detected by
immunological methods has first been subtracted.
Such autocatalytic sets of peptides and proteins,
obtained as noted, may find a number of practical
applications.

CLAIMS 

1. Process for the production of peptides or
polypeptides by microbiological means,
characterized in that genes which are at least
partially composed of stochastic synthetic
polynucleotides are produced simultaneously in a
common milieu, that the genes thus obtained are
introduced into host cells, that the independent
clones of the transformed host cells containing
these genes are simultaneously cultivated so as to
clone the stochastic genes and lead to the
production of proteins expressed by each of these
stochastic genes, that screening and/or selection is
carried out on such clones of transformed host cells,
to identify those clones producing peptides or
polypeptides having at least one specified property,
that the clones so identified are isolated, then grown
in a manner so as to produce at least one peptide or
polypeptide having the said property.

2. Process according to claim 1, characterized by
the fact that the genes are produced by stochastic
polymerization of the four types of
deoxyphosphonucleotides A, C, G and T, starting
from the two extremities of an expression vector
which was previously linearized, then by formation
of cohesive extremities to create a first strand of
stochastic DNA constituted of a molecule of
expression vector possessing two stochastic
sequences whose 3' extremities are
complementary, followed by synthesis of the
second strand of the stochastic DNA.
3. Process according to claim 1, characterized by
the fact that the genes are produced by stochastic
copolymerization of double stranded
oligonucleotides which do not have cohesive ends,
in a manner so as to form fragments of stochastic
DNA, followed by ligation of these fragments in an
expression vector which was previously linearized.
4. Process according to claim 2 or 3, characterized
by the fact that the expression vector is a plasmid.

5. Process according to claim 4, characterized by
the fact that the expression vector is pUC3.

6. Process according to claim 2 or claim 3,
characterized by the fact that the expression vector is a
fragment of viral DNA.

7. Process according to claim 2 or claim 3,
characterized by the fact that the expression vector
is a hybrid of plasmid and viral DNA.

8. Process according to claims 1 to 6, characterized
by the fact that the host cells are prokaryotic cells.

9. Process according to claims 1 to 7, characterized [...]



10. Process according to claim 8, characterized by
the fact that the cells are chosen among HB101 and
C600.

11. Process according to claim 3, characterized by
the fact that the oligonucleotides form a group of
palindromic octamers.

12. Process according to claim 11, characterized by
the fact that the group of palindromic octamers is
the following group:

5' GGAATTCC 3'
5' GGTCGACC 3'
5' CAAGCTTG 3'
5' CCATATGG 3'
5' CATCGATG 3'
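
As an aside, the palindromic character of these octamers (each reads the same as its own reverse complement) is easy to check mechanically. The following is a minimal sketch; the helper names `reverse_complement` and `is_palindromic` are illustrative and not taken from the specification.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def is_palindromic(seq):
    """A DNA palindrome equals its own reverse complement."""
    return seq == reverse_complement(seq)

OCTAMERS = ["GGAATTCC", "GGTCGACC", "CAAGCTTG", "CCATATGG", "CATCGATG"]
results = {octamer: is_palindromic(octamer) for octamer in OCTAMERS}
for octamer, ok in results.items():
    print(octamer, ok)  # each of the five octamers prints True
```

Each octamer also embeds a well-known six-base restriction site (GAATTC for EcoRI, GTCGAC for SalI, AAGCTT for HindIII, CATATG for NdeI, ATCGAT for ClaI), which is what makes the later restriction step of claim 15 possible.
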

13. Process according to claim 3, characterized by 
the fact that the oligonucleotides form a group of 
palindromic heptamers. 

14. Process according to claim 13, characterized
by the fact that the group of palindromic heptamers
is the following group:

5' XTCGCGA 3'
5' XCTGCAG 3'
5' RGGTACC 3'

where X = A, G, C or T, and R = A or T.
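
To make the degenerate notation concrete, the sketch below expands the three patterns into explicit sequences. It uses the symbol definitions given in the claim itself (note that R here means A or T, which differs from the usual IUPAC meaning of R); the names `SYMBOLS`, `PATTERNS` and `expand` are illustrative only. Since a seven-base sequence cannot equal its own reverse complement (the middle base would have to pair with itself), the check is applied to the six-base core that follows the variable first position.

```python
from itertools import product

# Symbol table as defined in claim 14: X is any base, R is A or T.
SYMBOLS = {"X": "ACGT", "R": "AT", "A": "A", "C": "C", "G": "G", "T": "T"}
PATTERNS = ["XTCGCGA", "XCTGCAG", "RGGTACC"]
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def expand(pattern):
    """List every concrete sequence matched by a degenerate pattern."""
    return ["".join(bases) for bases in product(*(SYMBOLS[ch] for ch in pattern))]

def core_is_palindromic(heptamer):
    """Check that the last six bases equal their own reverse complement."""
    core = heptamer[1:]
    return core == core.translate(COMPLEMENT)[::-1]

heptamers = [h for p in PATTERNS for h in expand(p)]
print(len(heptamers))  # 4 + 4 + 2 = 10 sequences
print(all(core_is_palindromic(h) for h in heptamers))
```
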

15. Process according to claim 4 and one of
claims 12 or 14, characterized by the fact that one first
isolates and purifies the transforming DNA derived
from a culture of independent clones of transformed
host cells obtained by proceeding in the manner
specified in claim 11 or in claim 13, then that one
cuts the purified DNA by means of at least one
restriction enzyme which corresponds to a specific
restriction site present in these palindromic
octamers or heptamers, but absent from the
expression vector being utilized, that one thereafter
simultaneously treats the ensemble of linearized
stochastic DNA fragments so obtained by T4 DNA
ligase in such a manner as to create a new ensemble
of DNA containing new stochastic sequences, and
that one uses this new ensemble of transforming
DNA to transform host cells and clone such genes,
and finally that one carries out screening and/or
selection and isolates the new clones of
transformed cells and that one grows these so as to
produce at least one peptide or polypeptide having
a desired property.
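
The cut-and-religate cycle of claim 15 can be pictured with a deliberately simplified simulation. The sketch below is a toy model, not the claimed procedure: it assumes two hypothetical inserts built from the octamers of claim 12, uses the ClaI site ATCGAT (cut between AT and CGAT) as the example enzyme, pools all fragments, and rejoins them in random order. It ignores the vector, fragment orientation and ligation efficiency, and all function names and parameters are illustrative.

```python
import random

def digest(seq, site, cut_offset):
    """Cut seq at every occurrence of site, cut_offset bases into the site."""
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return [f for f in fragments if f]

def religate(fragments, n_products, rng):
    """Shuffle the pooled fragments and deal them out into n_products new sequences."""
    pool = list(fragments)
    rng.shuffle(pool)
    products = [""] * n_products
    for i, frag in enumerate(pool):
        products[i % n_products] += frag
    return products

# Two hypothetical stochastic inserts assembled from the claim 12 octamers.
inserts = ["GGAATTCCCATCGATGGGTCGACC", "CAAGCTTGCATCGATGCCATATGG"]
SITE, CUT_OFFSET = "ATCGAT", 2  # ClaI recognises ATCGAT and cuts AT^CGAT

rng = random.Random(1)  # fixed seed, used only to make the sketch repeatable
pool = [frag for seq in inserts for frag in digest(seq, SITE, CUT_OFFSET)]
new_inserts = religate(pool, n_products=len(inserts), rng=rng)
print(pool)
print(new_inserts)
```

In this model every base of the original inserts is conserved; only the order in which the fragments are joined changes, which is the sense in which the religated ensemble contains new stochastic sequences.
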
16. Process according to claim 1, characterized by
the fact that the said property is the capacity to
catalyse a given chemical reaction.

17. Process according to claim 1 for the
production of several peptides and/or polypeptides,
characterized by the fact that the said property is the
capacity to catalyse a sequence of reactions leading
from a given group of initial chemical compounds to
at least one target compound.

18. Process according to claim 16 for the
production of an ensemble consisting of more than
one peptide and/or polypeptide which is reflexively
autocatalytic, characterized by the fact that the said
property is the capacity to catalyse the synthesis of
the ensemble itself starting from amino acids and/or [...]

19. Process according to claim 1, characterized by
the fact that the said property is the capacity to
modify selectively the chemical and/or biological
properties of a given compound.

20. Process according to claim 19, characterized
by the fact that the said property is the capacity to
modify selectively the catalytic activity of a
polypeptide.

21. Process according to claim 19, characterized
by the fact that the said property is the capacity to
simulate or modify at least one biological function
of at least one biologically active compound.

22. Process according to claim 21, characterized
by the fact that the said biologically active
compound is chosen among the hormones,
neurotransmitters, adhesion or growth factors, and
specific regulators of replication and/or
transcription of DNA, and/or translation of RNA.

23. Process according to claim 1, characterized by 
the fact that the said property is the capacity to bind 
to a given ligand.

24. Utilization of a peptide or polypeptide 
obtained by the process according to claim 23 for 
the detection and/or titration of the ligand. 

25. Process according to claim 1, characterized by 
the fact that the property is to have at least one 
epitope similar to one of the epitopes of a given 
antigen. 

26. Process according to claims 19 and 25,
characterized by the fact that the said property is the
capacity to simulate or modify the effects of a
biologically active molecule, and that screening and/
or selection of the clones of transformed host cells
producing at least one peptide or polypeptide
having this property is carried out by preparing
antibodies against that molecule, and utilizing these
antibodies so obtained to identify those clones
containing those peptides or polypeptides, then by
growing the clones thus identified and separating
and purifying the peptide or polypeptide produced
by these clones, and finally by submitting these
peptide(s) or polypeptide(s) to an assay in vitro to
verify that it has in fact the capacity to simulate or
modify the effects of the said molecule.

27. Peptides or polypeptides obtained by the
process according to claim 1 or claim 25, utilizable
as active substances having a pharmacologic and/or
chemotherapeutic action.

28. Peptides or polypeptides, obtained by the
process according to claim 25, utilizable to diminish,
in vitro or in vivo, the concentration of free
antibodies, specific for the said antigen, by
formation of bonds between these peptides or
polypeptides and these antibodies.

29. Peptides or polypeptides according to claims
27 or 28, utilizable as agents to suppress
immunological hypersensitivity.

30. Peptides or polypeptides, obtained by the
process according to claim 25, utilizable as agents to
create tolerance with respect to the said antigen.

31. Process according to claim 25, characterized
by the fact that the antigen is EGF.

32. Peptides or polypeptides obtained by the
process according to claim 31, utilizable for the [...]




33. Process according to claim 1, characterized by
the fact that clones of transformed host cells
producing peptides or polypeptides having the
desired property are identified and isolated by
affinity chromatography on antibodies
corresponding to a protein expressed by the natural
fragment of the DNA hybrid.

34. Process according to claim 33, characterized
by the fact that the natural fragment of the DNA
hybrid contains a gene expressing β-galactosidase,
and that one identifies and isolates the said clones
of transformed host cells by affinity
chromatography with anti-β-galactosidase
antibodies.

35. Process according to claim 1 or claim 34,
characterized by the fact that after expression and
purification of the hybrid peptides or polypeptides,
the novel fragments are separated and isolated.

36. Application of the process according to claim
25 or claim 26, for the preparation of a vaccine,
characterized by the fact that antibodies against a
pathogenic agent are obtained and used to identify
those clones producing at least one protein having
at least one epitope similar to one of the epitopes of
the pathogenic agent, that the corresponding clones
of transformed host cells are grown in such a
manner as to produce this protein, that the protein is
isolated and purified from the cultures of clones of
cells and that this protein is used for the production
of a vaccine against the pathogenic agent.

37. Application according to claim 36, for the
preparation of an anti-HVB vaccine, characterized by
the fact that at least one capsid protein of the HVB
virus is extracted and purified, that this protein is
injected into the body of an animal capable of
forming antibodies against this protein, that these
antibodies are recovered and purified, that these
antibodies are used to identify those clones
producing at least one protein having at least one
epitope similar to one of the epitopes of the HVB
virus, that the clones of transformed host cells
corresponding to these clones are grown in a
manner to produce this protein, that this protein is
isolated and purified from these cultures of host
cells, and that this protein is used for the production
of an anti-HVB vaccine.

38. Process according to claim 1, characterized by
the fact that the host cells consist in bacteria of
Escherichia coli type, whose genome contains
neither the natural β-galactosidase gene, nor the
EBG gene, that is, Z-, EBG- E. coli, and that these
transformed host cells are cultured in an X-gal
medium also containing the inducer IPTG, that
clones which are positive for β-galactosidase
function are detected in the culture milieu, that
thereafter this DNA is transplanted to a clone of host
cells appropriate for industrial production of at least
one peptide, polypeptide or protein with
β-galactosidase function.

39. Process according to claim 1, characterized by
the fact that the said property is the capacity to bind
to a given compound.

40. Process according to claim 39, characterized [...]

41. Process according to claim 40, characterized
by the fact that the said proteins are proteins
regulating the transcription activity or replication of
DNA.

42. Process according to claim 39, characterized
by the fact that the said compound is chosen among
the sequences of DNA and RNA.

43. Proteins obtained by the process according to 
claim 40 or claim 42. 

44. Process for the production of DNA,
characterized by the fact that, in the same milieu,
genes which are at least partially composed of
stochastic synthetic polynucleotides are produced,
that the genes so obtained are introduced into host
cells in a manner to produce an ensemble of
independent clones of the host cells so
produced, that screening and/or selection is carried
out on this ensemble to identify those host cells
which contain those stochastic sequences of DNA
having at least one desired property, and that the
DNA is isolated from the cultures of the
host cells.

45. Process according to claim 44, characterized
by the fact that the said property is the capacity to
bind to a given compound.

46. Process according to claim 45, characterized
by the fact that the said compound is chosen among
the peptides, polypeptides and proteins.

47. Process according to claim 45, characterized
by the fact that the said compound is a compound
regulating the transcription activity or the
replication of DNA.

48. Process according to claim 47, characterized
by the fact that the said compound is a [...]
protein controlling the transcription or replication of
DNA.



49. Utilisation of a sequence of DNA obtained by
the process according to claim 4$ or claim 47, as a
cis-regulatory sequence of replication or
transcription of a neighboring sequence of DNA.

50. Process according to claim 42, characterized
by the fact that the proteins obtained have the
capacity to modify the transcription activity, the
replication, or the stability of DNA.

51. Utilization of a protein obtained by the process
according to claim 4&, to modify the transcription,
replication or stability of a sequence of DNA in a cell
containing this sequence of DNA and expressing
this protein.

52. Process for the production of RNA,
characterized by the fact that, in the same milieu,
genes which are at least partially composed of
synthetic stochastic polynucleotides are produced
simultaneously, that the genes thus obtained are
introduced in host cells to produce an
ensemble of transformed host cells, that the
independent clones of transformed host cells so
produced are grown simultaneously, that
screening and/or selection is carried out on this
ensemble in a manner to identify those host cells
containing stochastic sequences of RNA having at
least one desired property, and that the RNA is
isolated from the cultures of host cells so identified.

53. Process according to claim 52, characterized
by the fact that the said property is the capacity to
bind to a given compound.

54. Process according to claim 52, characterized
by the fact that the said property is the capacity to
catalyse a given chemical reaction.

55. Process according to claim 52, characterized
by the fact that the said property is to be a transfer
RNA.



